Skip to content

Data compression for screen data#81

Open
schewskone wants to merge 34 commits intosensorium-competition:mainfrom
schewskone:compressed_data
Open

Data compression for screen data#81
schewskone wants to merge 34 commits intosensorium-competition:mainfrom
schewskone:compressed_data

Conversation

@schewskone
Copy link
Collaborator

Implemented support for compressed screen formats

  • Added EncodedImageTrial and EncodedVideoTrial which feature get_data_() methods that support encoded formats.
  • Extended screen tests and screen data generation to feature encoded data in mp4 and jpeg format.
  • Thinned requirements.txt and added ffmpeg installation to CI workflow.

Right now the VideoDecoder decodes the entire video and returns that to the interpolation function. This can be optimized with the help of sliced decoding which would require changes to the interpolation method. @pollytur and I decided we should discuss first and implement it later on if deemed necessary.

@pollytur pollytur requested a review from Copilot July 3, 2025 10:07
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for compressed screen data by extending trials to handle encoded images and videos, updating tests and data generation to work with JPEG/MP4, and adjusting interpolation interfaces and CI.

  • Introduce EncodedImageTrial and EncodedVideoTrial with compressed data loaders
  • Enhance test utilities and screen interpolation tests for both encoded and raw formats
  • Update interpolation API to return only data, adjust downstream consumers, and add FFmpeg to CI

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/test_sequence_interpolator.py Update tests for new single-array return value of interpolate
tests/test_screen_interpolator.py Add parameterized tests for encoded vs. raw screen data
tests/create_screen_data.py Extend data generation to save JPEG/MP4 and emit metadata
experanto/interpolators.py Remove valid returns, introduce image_names flag, add encoded trials
experanto/experiment.py Adjust interpolate to drop valid output
experanto/datasets.py Update dataset pipelines to match new return signature
configs/default.yaml Add image_names configuration
.github/workflows/test.yml Install FFmpeg for encoding dependencies
Comments suppressed due to low confidence (4)

experanto/interpolators.py:152

  • The return type annotation still indicates a tuple, but the method now returns only a single array. Update the type hint and docstring to reflect the new signature.
    def interpolate(self, times: np.ndarray) -> tuple[np.ndarray, np.ndarray]:

experanto/interpolators.py:425

  • [nitpick] Using format shadows the built-in Python function. Consider renaming this variable to file_format to avoid confusion.
            format = metadata.get("file_format")

tests/create_screen_data.py:14

  • The new encoded parameter isn't documented. Please update the function docstring to explain what encoded does and what formats it controls.
def create_screen_data(

tests/create_screen_data.py:118

  • The code calls shutil.rmtree but shutil is not imported. Add import shutil at the top of the module.
        shutil.rmtree(SCREEN_ROOT)

Copilot AI review requested due to automatic review settings January 27, 2026 21:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI review requested due to automatic review settings January 27, 2026 21:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 10 changed files in this pull request and generated 16 comments.

Comments suppressed due to low confidence (1)

experanto/interpolators.py:826

            self._cached_data = self.get_data_()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +912 to +927
def __init__(
self, data_file_name, meta_data, shared_decoder=None, cache_data: bool = False
) -> None:
super().__init__(
data_file_name,
meta_data,
tuple(meta_data.get("image_size")),
meta_data.get("first_frame_idx"),
meta_data.get("num_frames"),
cache_data=cache_data,
)
self.video_decoder = shared_decoder
if self.video_decoder is None:
raise ValueError(
"EncodedVideoTrial requires a shared_decoder to be provided."
)
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a critical initialization order bug. The parent class ScreenTrial.init (called via super()) will attempt to cache data by calling get_data_() at line 826 if cache_data=True. However, self.video_decoder is only set after super().init() completes (line 923). This means get_data_() will be called before video_decoder is initialized, causing the check at lines 924-927 to raise a ValueError. The video_decoder should be set before calling super().init(), or caching should be disabled for EncodedVideoTrial since it requires frame_indices.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone - just to be sure - have you checked this does not actually happen?
(in default configs _cache_data is set to False, so we might have missed it)

@schewskone schewskone requested a review from pollytur January 29, 2026 15:58

frames = np.random.rand(n_frames, *frame_shape).astype(np.float32)
# Generate frames with values in [0, 255] for better encoding
frames = (np.random.rand(n_frames, *frame_shape) * 255).astype(np.uint8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone why is it now np.uint8 and not float32?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get not all of the asserts fro valid should be removed here.
@schewskone could you please take a look and plug some of them back, f.e. line 37 and 41-43 probably should stay, same for 71 and 75-77

(I merged it with the current main - check if you disagree with stn)

file_ext = ".jpg"
else:
# Save as numpy array (original behavior)
np.save(data_dir / f"{i:05d}.npy", frames[i].astype(np.uint8))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone it used to be np.float32 here in test as well - why have you changed it and does it match the real exported data?

Also, why there are no grayscale .jpeg saved? (e.g. 1 channel ones)

# Initialize the decoder only once per unique file
# Assuming ScreenTrial.create or a helper can return just a decoder
if self.device == "cuda":
with set_cuda_backend("beta"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone could you please add a comment here what set_cuda_backend("beta") does? and is it necessary to have it here because moving decoded data from CPU to GPU after interpolating is much slower? have you profiled it?


# 2. Logic to handle shared video decoders
decoder_to_use = None
if file_format in [".mp4", ".avi", ".mov"]: # Add your video formats here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone should we check that a file_format is .npy otherwise and throw a NonImplemented error is not?

if data_file_name not in shared_decoders:
# Initialize the decoder only once per unique file
# Assuming ScreenTrial.create or a helper can return just a decoder
if self.device == "cuda":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is device cuda:0? I would suggest checking if device contains cuda

Comment on lines +495 to +497
def _initialize_decoder(self, data_file_name):
decoder = VideoDecoder(
str(data_file_name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone why do you need to cast it to str? could it be sth else?

Suggested change
def _initialize_decoder(self, data_file_name):
decoder = VideoDecoder(
str(data_file_name),
def _initialize_decoder(self, data_file_name: str):
decoder = VideoDecoder(
data_file_name,

) -> Union[tuple[dict, dict], dict, tuple[np.ndarray, np.ndarray], np.ndarray]:
if device is None:
values = {}
valid = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be returned back!

)


class EncodedImageTrial(ScreenTrial):
Copy link
Contributor

@pollytur pollytur Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone where is it actually used? I cannot find it in being called explicitly but I might be missing sth (the only place I see it could be called now is lines 782-783)

cls = globals()[class_name]

# Pass shared_decoder only for EncodedVideoTrials
if cls is EncodedVideoTrial:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone could we please add a small comment here why EncodedImageTrial don't need a shared decoder?


def get_data_(self) -> np.array:
"""Override base implementation to load compressed images"""
img = cv2.imread(str(self.data_file_name)) # returns BGR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess here even if the original .jpg is 1 channel (grayscale) image - cv2.imread would upload 3 duplicated channels. Do we want to catch it somehow, @schewskone (Warnings? or set a flag to cast things to 1 channel if the user knows their jpg are grayscale?)?

if img is None:
raise ValueError(f"Could not read image file: {self.data_file_name}")
# Convert BGR to RGB
img = img[:, :, [2, 1, 0]]
Copy link
Contributor

@pollytur pollytur Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone why not cv2.cvtColor(img, cv2.COLOR_BGR2RGB)? I'm not 100% sure but quick googling says that cv2 should be faster (both create a contagious copy of the image though)


def get_data_(self, frame_indices=None) -> np.array:
"""Override base implementation to load compressed videos"""
# Frame_indices is only ever None for caching purposes, we then decode entire video and cache it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maybe add an assert then, that if self._cached_data is None then frame_indices should not be None?

Comment on lines +879 to +881
frames = frames[:, [2, 1, 0], ...]
# Reorder dimensions
frames = frames.permute(0, 2, 3, 1).contiguous() # T,H,W,C
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not to do it together? because it looks like some dimensions end up being moved twice

"EncodedVideoTrial requires a shared_decoder to be provided."
)

def get_data(self, frame_indices) -> np.array:
Copy link
Contributor

@pollytur pollytur Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_data(self, frame_indices) -> np.array:
def get_data(self, frame_indices) -> np.ndarray:

Comment on lines +550 to +552
out = out.transpose(
0, 3, 1, 2
) # transform into (T, C, H, W) after finishing with Cv2 operation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone should this reordering really be here? because it has not been here before. is it back compatible with the numpy files?

Copy link
Contributor

@pollytur pollytur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please look at the comments - at least test need to plug back valid checks for sure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants